Dual Attention Based Image Pyramid Network for Object Detection

Xiang Dong; Feng Li; Huihui Bai; Yao Zhao

TIIS (Korean Society for Internet Information)

Korean Title	Dual Attention Based Image Pyramid Network for Object Detection
English Title	Dual Attention Based Image Pyramid Network for Object Detection
Author	Xiang Dong; Feng Li; Huihui Bai; Yao Zhao
Citation	Vol. 15, No. 12, pp. 4439-4455 (December 2021)
Korean Abstract
English Abstract	Compared with two-stage object detection algorithms, one-stage algorithms provide a better trade-off between real-time performance and accuracy. However, these methods treat intermediate features equally and therefore lack the flexibility to emphasize information that is meaningful for classification and localization. They also ignore the interaction of contextual information across scales, which is important for detecting medium and small objects. To tackle these problems, we propose an image pyramid network based on a dual attention mechanism (DAIPNet) for one-stage object detection. It builds an image pyramid to enrich spatial information and uses dual attention mechanisms to emphasize informative multi-scale features. Our framework uses a pre-trained backbone as the standard detection network, with the designed image pyramid network (IPN) serving as an auxiliary network that provides complementary information. The dual attention mechanism is composed of the adaptive feature fusion module (AFFM) and the progressive attention fusion module (PAFM). AFFM automatically weights the feature maps from the backbone and the auxiliary network according to their importance, while PAFM adaptively learns channel-attentive information during the context transfer process. Furthermore, in the IPN, we build an image pyramid to extract scale-wise features from downsampled images of different scales; these features are further fused at different states to enrich scale-wise information and learn more comprehensive feature representations. Experimental results are reported on the MS COCO dataset. With a 300×300 input, the proposed detector achieves 32.6% mAP on MS COCO test-dev, which is superior to the compared state-of-the-art methods.
Keyword	Dual attention mechanism; Adaptive feature fusion module; Progressive attention fusion module; Image pyramid network; Multi-scale object detection
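The two ideas the abstract relies on, an image pyramid built from downsampled copies of the input and an importance-weighted fusion of backbone and auxiliary feature maps, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`downsample2x`, `build_image_pyramid`, `fuse`), the 2x2 average pooling, and the fixed fusion scores are all assumptions made for illustration. In the paper the fusion weights are learned by AFFM, which the abstract does not specify in detail.

```python
import math


def downsample2x(img):
    """Halve height and width by averaging each 2x2 block (assumed pooling choice)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def build_image_pyramid(img, levels):
    """Return [img, img at 1/2 scale, img at 1/4 scale, ...] with `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(backbone_map, auxiliary_map, scores):
    """Blend two same-sized maps with softmax-normalised importance weights.

    `scores` stands in for values a trained module would predict (hypothetical).
    """
    w_backbone, w_auxiliary = softmax(scores)
    return [
        [w_backbone * b + w_auxiliary * a for b, a in zip(row_b, row_a)]
        for row_b, row_a in zip(backbone_map, auxiliary_map)
    ]


# Toy 8x8 single-channel "image" whose pixel value encodes its position.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_image_pyramid(image, levels=3)
sizes = [(len(p), len(p[0])) for p in pyramid]  # [(8, 8), (4, 4), (2, 2)]

# Equal scores give equal weights, so fusing a map with itself returns it unchanged.
fused = fuse(pyramid[1], pyramid[1], scores=[0.0, 0.0])
```

Each pyramid level would feed the auxiliary network at the matching scale. Normalising the scores with a softmax keeps the fused map on the same value range as its inputs, which is one common way to express "different importance" between two sources.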